Recap of last lecture

  • an abundance of data sources 📃
    • often PDF, few datasets
  • creating your own dataset 🪛
    • convert PDF to .txt
  • questions regarding the 2nd assignment?
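
As a reminder of the PDF-to-text step, here is a minimal sketch. It assumes the `pdftotext` command-line tool (part of poppler-utils) is installed; the tool choice, the folder layout, and the helper names `txt_path_for` and `convert_folder` are illustrative assumptions, not necessarily what was used in the last lecture.

```python
import shutil
import subprocess
from pathlib import Path


def txt_path_for(pdf_path: Path, out_dir: Path) -> Path:
    """Return the .txt path that mirrors a PDF's file name inside out_dir."""
    return out_dir / (pdf_path.stem + ".txt")


def convert_folder(pdf_dir: Path, out_dir: Path) -> list[Path]:
    """Convert every PDF in pdf_dir to plain text and return the new files."""
    # Assumption: the pdftotext CLI from poppler-utils is on the PATH.
    if shutil.which("pdftotext") is None:
        raise RuntimeError("pdftotext not found; install poppler-utils first")
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for pdf_path in sorted(pdf_dir.glob("*.pdf")):
        txt_path = txt_path_for(pdf_path, out_dir)
        # -layout asks pdftotext to keep the page layout as far as possible.
        subprocess.run(
            ["pdftotext", "-layout", str(pdf_path), str(txt_path)],
            check=True,
        )
        written.append(txt_path)
    return written
```

Calling `convert_folder(Path("pdfs"), Path("txt"))` would write one .txt file per PDF; scanned PDFs without a text layer would still need OCR first.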

Outline

  • Get ready for the mini-project 📝
  • Curate data on SwissDox 📰
  • Perform media analysis with Pandas 🐼

Mini-project

Present your project on 23 May 2025

  • analyze any collection of text documents
    • compare historically
    • compare between actors
  • form groups of 2-3 people
  • requirements
    • apply quantitative measures to multiple documents
    • interpret and present results in class
    • share an executable script
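
To give an idea of what a quantitative measure applied to multiple documents could look like, here is a minimal sketch that compares the relative frequency of one term across documents from different years. The toy corpus, the search term, and the function names are made up for illustration; a real project would load its own texts and may well choose a different measure.

```python
import re
from collections import Counter

# Hypothetical toy corpus: (year, text) pairs standing in for real documents.
documents = [
    (1990, "The climate debate is new. Climate policy is rare."),
    (2020, "Climate change dominates the news. Climate, climate, climate."),
]


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into word tokens.

    The pattern matches ASCII letters only; extend it for umlauts or accents.
    """
    return re.findall(r"[a-z]+", text.lower())


def relative_frequency(term: str, text: str) -> float:
    """Share of tokens in text that equal term (0.0 for empty texts)."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    return Counter(tokens)[term] / len(tokens)


# Compare historically: one value per document, ordered by year.
for year, text in sorted(documents):
    print(year, round(relative_frequency("climate", text), 3))
```

Relative rather than absolute counts are used so that documents of different lengths stay comparable. The same loop could group texts by actor instead of by year.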

❗ share your project idea here by 1 May 2025

Optional seminar paper

  • writing a seminar paper (6 ECTS)
  • get in touch to discuss your idea

Let’s start with real data science